Imposing Hierarchical Browsing Structures onto Spoken Documents
نویسندگان
چکیده
This paper studies the problem of imposing a known hierarchical structure onto an unstructured spoken document, aiming to help browse such archives. We formulate our solutions within a dynamic-programming-based alignment framework and use minimum errorrate training to combine a number of global and hierarchical constraints. This pragmatic approach is computationally efficient. Results show that it outperforms a baseline that ignores the hierarchical and global features and the improvement is consistent on transcripts with different WERs. Directly imposing such hierarchical structures onto raw speech without using transcripts yields competitive results.
منابع مشابه
A Normalized-Cut Alignment Model for Mapping Hierarchical Semantic Structures onto Spoken Documents
We propose a normalized-cut model for the problem of aligning a known hierarchical browsing structure, e.g., electronic slides of lecture recordings, with the sequential transcripts of the corresponding spoken documents, with the aim to help index and access the latter. This model optimizes a normalizedcut graph-partitioning criterion and considers local tree constraints at the same time. The e...
متن کاملHierarchical topic organization and visual presentation of spoken documents using probabilistic latent semantic analysis (PLSA) for efficient retrieval/browsing applications
The most attractive form of future network content will be multi-media including speech information, and such speech information usually carries the core concepts for the content. As a result, the spoken documents associated with the multi-media content very possibly can serve as the key for retrieval and browsing. This paper presents a new approach of hierarchical topic organization and visual...
متن کاملIndexing Spoken Documents with Hierarchical Semantic Structures: Semantic Tree-to-string Alignment Models
This paper addresses a semantic tree-tostring alignment problem: indexing spoken documents with known hierarchical semantic structures, with the goal to help index and access such archives. We propose and study a number of alignment models of different modeling capabilities and time complexities to provide a comprehensive understanding of these unsupervised models and hence the problem itself.
متن کاملOrganization of Gatekeeping and Mental Framework in the System of Representation and Hierarchical Relational Structures of the Modern Society
Critical discourse analysis as a type of social practice reveals how linguistic choices enable speakers to manipulate the realizations of agency and power in the representation of action.The present study examines the relationship between language and ideology and explores how such a relationship is represented in the analysis of spoken text and to show how declarative knowledge, beliefs, attit...
متن کاملMulti-layered Summarization of Spo Information Extraction and S
The spoken documents are very difficult to be shown on the screen, and very difficult to retrieve and browse. It is therefore important to develop technologies to summarize the entire archives of the huge quantities of spoken documents in the network content to help the user in browsing and retrieval. In this paper we propose a complete set of multi-layered technologies to handle at least some ...
متن کامل